[Bugfix] Support Qwen3-MOE on aclgraph mode#1381
Codecov Report❌ Patch coverage is
Additional details and impacted files@@ Coverage Diff @@
## main #1381 +/- ##
===========================================
+ Coverage 27.39% 52.34% +24.95%
===========================================
Files 56 78 +22
Lines 6191 9641 +3450
===========================================
+ Hits 1696 5047 +3351
- Misses 4495 4594 +99
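As a quick sanity check on the table above, the percentages can be recomputed from the hit and miss counts. This is a minimal sketch, assuming lines = hits + misses (which matches the reported totals); the helper name `coverage_pct` is hypothetical and is not part of Codecov.

```python
def coverage_pct(hits: int, misses: int) -> float:
    """Line coverage as a percentage, assuming lines = hits + misses."""
    lines = hits + misses
    return 100.0 * hits / lines


# Counts copied from the coverage diff above.
main_pct = coverage_pct(1696, 4495)  # main: 6191 lines
pr_pct = coverage_pct(5047, 4594)    # PR #1381: 9641 lines
delta = pr_pct - main_pct

print(f"main={main_pct:.4f}% pr={pr_pct:.4f}% delta={delta:+.4f}%")
```

The recomputed values agree with the reported 27.39%, 52.34%, and +24.95% to within 0.01 percentage points.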
@ApsarasX Have you evaluated various combinations of |
|
@ApsarasX Thanks for your PR. I tried Qwen/Qwen3-30B-A3B on the main branch, and the issue also exists there. Run mode: Run partial log: Send request: Request return: The original error information for Qwen/Qwen3-30B-A3B is as follows, the same as in #1368 |
|
@ApsarasX is it ready to merge? Any idea about #1368 (comment)? |
I don't think they are the same error; they should be unrelated. |
|
The @ApsarasX , could you please rebase and add me as a co-author? Thank you. |
I have added you as a co-author. Could you please handle these corner cases in the future? |
Yeah, already on my schedule. |
PR ready, please merge |
|
@Yikun Please review |
|
Please add an e2e test for qwen3-moe as well. |
|
You can add the model test at https://github.com/vllm-project/vllm-ascend/blob/main/tests/e2e/singlecard/test_aclgraph.py#L32 by running the reduced-layer model: |
Co-authored-by: Yizhou Liu <liu_yizhou@outlook.com> Signed-off-by: ApsarasX <apsarax@outlook.com>
Signed-off-by: Yikun Jiang <yikunkero@gmail.com>
|
Did a double confirm on: And added an e2e test for the qwen aclgraph case. LGTM. Thanks all @ApsarasX @yiz-liu @leo-pony @wangxiyuan |
### What this PR does / why we need it?
Fix the shape of the `npu_moe_init_routing` input parameters to support aclgraph mode on qwen3-moe

In addition to this PR, resolving the `gatherv3` error might be necessary. See related PR vllm-project#1297 vllm-project#1446

Thanks to @yiz-liu for providing the idea

### Does this PR introduce _any_ user-facing change?
No

### How was this patch tested?
Tested on Qwen3-30B-A3B

Closes: vllm-project#1368

---------

Signed-off-by: ApsarasX <apsarax@outlook.com>
Signed-off-by: Yikun Jiang <yikunkero@gmail.com>
Co-authored-by: Yizhou Liu <liu_yizhou@outlook.com>
Co-authored-by: Yikun Jiang <yikunkero@gmail.com>
### What this PR does / why we need it?
Fix the shape of the `npu_moe_init_routing` input parameters to support aclgraph mode on qwen3-moe

In addition to this PR, resolving the `gatherv3` error might be necessary. See related PR vllm-project#1297 vllm-project#1446

Thanks to @yiz-liu for providing the idea

### Does this PR introduce _any_ user-facing change?
No

### How was this patch tested?
Tested on Qwen3-30B-A3B

Closes: vllm-project#1368

---------

Signed-off-by: ApsarasX <apsarax@outlook.com>
Signed-off-by: Yikun Jiang <yikunkero@gmail.com>
Co-authored-by: Yizhou Liu <liu_yizhou@outlook.com>
Co-authored-by: Yikun Jiang <yikunkero@gmail.com>
Signed-off-by: nanxing <1014662416@qq.com>
What this PR does / why we need it?
Fix the shape of the `npu_moe_init_routing` input parameters to support aclgraph mode on qwen3-moe

In addition to this PR, resolving the `gatherv3` error might be necessary. See related PR #1297 #1446

Thanks to @yiz-liu for providing the idea
Does this PR introduce any user-facing change?
No
How was this patch tested?
Tested on Qwen3-30B-A3B
Closes: #1368